Spotting Prosodic Boundaries in Continuous Speech in French

Authors

  • Vincent Pagel
  • Noëlle Carbonell
  • Yves Laprie
  • Jacqueline Vaissière
Abstract

Two kinds of marks were set by the listeners, frontiers (boundaries) and accents, both attached to the syllable nucleus. The MLP, fed with any of the previously described values (F0, duration, ...) and whatever the size of the temporal window, cannot reproduce the accent marking with a good score. We therefore consider that the listeners' accent marks are not consistent, at least from a local point of view. For the frontier marks, however, the MLP fed with duration over a 5-vowel context achieves the task with 11% insertions and 43% omissions.

Phonetician marks

At this stage, we use the auditory marks to select a significant subset of the marks set by the expert. Given the number of mark types obtained, we found it necessary to gather them into generic classes so that the MLP could be trained correctly: R for initial rise (129 occurrences), P for peaks (128), B for baseline (105), C for continuation rise (50), and Nil for no marking at all (1287). After several tests, we kept vowel duration, F0 values, and pseudo-syllable duration over a window of 7 vocalic nuclei as input to an MLP with 10 neurons in its hidden layer. The MLP has 5 outputs, one for each class mentioned above.

Confusion matrix (column order inferred from the dominant diagonal):

         Nil     B     C     P     R
  Nil    227     6     0     4     3
  B        7    20     0     0     0
  C        0     5     7     1     1
  P        5     0     1    25     5
  R        0     0     0     5    34

The MLP gives no answer for 44 configurations (concurrent answers). Surprisingly, no nasality tag is needed to draw the MLP's attention to the fact that nasal vowels are much longer than oral ones.

RESULTS AND CONCLUSION

The main result is that this experiment validates both the expert's prosodic marking and the automatic spotting system. Furthermore, the confusion rate between P and R marks is rather low, which agrees with the results of [4]: lengthening is a more important correlate of the F0 peak for P than for R. The R marks recognized as P are accented monosyllabic words. The recognition rate for C improves when we add F0 regression parameters, as the vowels involved bear a long upward F0 movement. However, this adds slight confusion in the identification of P marks. Future work will aim at incorporating long-term prosodic variations into the modelling of our prosodic marks.

REFERENCES

[1] J. Vaissière (1982), «A suprasegmental component in a French speech recognition system: …
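
The confusion counts quoted in the abstract can be summarised with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the columns follow the same class order as the rows (Nil, B, C, P, R), which the dominant diagonal suggests but the text does not state, and it makes no assumption about whether rows are the expert's marks or the MLP's answers. The names `MATRIX`, `row_share`, and `col_share` are hypothetical.

```python
LABELS = ["Nil", "B", "C", "P", "R"]

# Counts as printed in the abstract.
# ASSUMPTION: the column order matches the row order (Nil, B, C, P, R).
MATRIX = [
    [227, 6, 0, 4, 3],
    [7, 20, 0, 0, 0],
    [0, 5, 7, 1, 1],
    [5, 0, 1, 25, 5],
    [0, 0, 0, 5, 34],
]
UNANSWERED = 44  # configurations for which the MLP gave no answer

# Number of answered configurations and number lying on the diagonal.
total = sum(sum(row) for row in MATRIX)
diagonal = sum(MATRIX[i][i] for i in range(len(LABELS)))
agreement = diagonal / total

# Share of each row that falls on the diagonal.
row_share = {
    label: MATRIX[i][i] / sum(MATRIX[i]) for i, label in enumerate(LABELS)
}
# Share of each column that falls on the diagonal.
col_share = {
    label: MATRIX[i][i] / sum(row[i] for row in MATRIX)
    for i, label in enumerate(LABELS)
}

print(f"answered: {total}, unanswered: {UNANSWERED}")
print(f"agreement on answered configurations: {agreement:.3f}")
for label in LABELS:
    print(f"{label:>3}  row: {row_share[label]:.3f}  column: {col_share[label]:.3f}")
```

Under these assumptions the counts sum to 356 answered configurations, 313 of which lie on the diagonal, and the two P/R cells hold 5 items each, which is consistent with the low P/R confusion reported in the text.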

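The abstract describes feeding the MLP with vowel duration, F0 values, and pseudo-syllable duration over a window of 7 vocalic nuclei. A minimal sketch of how such a windowed input vector might be assembled follows. Several details are assumptions rather than facts from the source: a single F0 value per nucleus, zero-padding at utterance edges, and the helper name `window_inputs`. The example values are made up.

```python
from typing import List, Sequence

WINDOW = 7    # vocalic nuclei per window, as stated in the abstract
FEATURES = 3  # ASSUMPTION: vowel duration, one F0 value, pseudo-syllable duration


def window_inputs(
    nuclei: Sequence[Sequence[float]], window: int = WINDOW
) -> List[List[float]]:
    """Return one flat input vector per nucleus, centred on that nucleus.

    Positions outside the utterance are zero-padded. This is an assumption:
    the source does not say how utterance edges are handled.
    """
    half = window // 2
    pad = [0.0] * FEATURES
    vectors = []
    for i in range(len(nuclei)):
        vec: List[float] = []
        for j in range(i - half, i + half + 1):
            frame = nuclei[j] if 0 <= j < len(nuclei) else pad
            vec.extend(frame)
        vectors.append(vec)
    return vectors


# Hypothetical values: (vowel duration in ms, F0 in Hz, pseudo-syllable duration in ms)
example = [(80.0, 120.0, 150.0), (95.0, 135.0, 180.0), (140.0, 110.0, 260.0)]
vectors = window_inputs(example)
print(len(vectors), len(vectors[0]))
```

With three features per nucleus, each vector has 21 components, which would then feed the 10-unit hidden layer and the 5 class outputs mentioned in the text.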

Related resources

Prosodic Trees for Boundary Detection in ASR in French

Prosodic trees as a hierarchical representation of prosodic organization in French proved to be efficient for automatic processing of continuous speech. We applied this technique to the prosodic boundary detection on the output of a speech recognition application in order to test whether prosodic boundaries of different levels in tree confirm or not recognition hypotheses. Two types of tree con...


Prosodic hierarchy and spectral realization of vowels in French

The aim of this study is to relate spectral realization of vowels and prosodic hierarchy in continuous speech. The IRISA speech alignment system is used and formant values of oral vowels are automatically measured in a total of 500,000 segments from around 30 hours of journalistic broadcast speech in French. The link between the duration of vowels and their spectral realization (through their f...


The Role of Prosodic Information in L2 Speech Segmentation

Unlike written language, where word boundaries are often denoted by blank spaces (e.g., le_chat 'the_cat'), for spoken language, no single device allows for the reliable identification of word boundaries: words are typically uttered without a pause between them, and sound processes further blur word boundaries. A crucial challenge for second/foreign language (L2) learners is that the cues to wo...


Effects of the Native Language on the Learning of Fundamental Frequency in Second-Language Speech Segmentation

This study investigates whether the learning of prosodic cues to word boundaries in speech segmentation is more difficult if the native and second/foreign languages (L1 and L2) have similar (though non-identical) prosodies than if they have markedly different prosodies (Prosodic-Learning Interference Hypothesis). It does so by comparing French, Korean, and English listeners' use of fundamental-...


Prosody in a corpus of French spontaneous speech: perception, annotation and prosody ~ syntax interaction

Our study focuses on the issue of prosodic annotation and of the prosody ~ syntax interface in conversation and is based on a large corpus of conversational speech in French. The results of inter-transcriber agreement tests show that two expert transcribers are consistent in their labeling of prosodic phrasing and the consistency is well above the chance. A qualitative analysis reveals transcri...



Journal:
  • CoRR

Volume: cmp-lg/9808014

Publication date: 1998